GRE: A Graph Runtime Engine for Large-Scale Distributed Graph-Parallel Applications

Abstract

Large-scale distributed graph-parallel computing is challenging. On one hand, due to the irregular computation pattern and lack of locality, it is hard to express parallelism efficiently. On the other hand, due to their scale-free nature, real-world graphs are hard to partition in a balanced way with a low cut. To address these challenges, several graph-parallel frameworks, including Pregel and GraphLab (PowerGraph), have been developed recently. In this paper, we present an alternative framework, Graph Runtime Engine (GRE). While retaining the vertex-centric programming model, GRE proposes two new abstractions: 1) a Scatter-Combine computation model based on active messages to exploit massive fine-grained edge-level parallelism, and 2) an Agent-Graph data model based on vertex factorization to partition and represent directed graphs. GRE is implemented on a commercial off-the-shelf multi-core cluster. We experimentally evaluate GRE with three benchmark programs (PageRank, Single Source Shortest Path and Connected Components) on real-world and synthetic graphs of millions to a billion vertices. Compared to PowerGraph, GRE shows 2.5 to 17 times better performance on 8 to 16 machines (192 cores). Specifically, PageRank in GRE is the fastest among its counterparts in other frameworks (PowerGraph, Spark, Twister) reported in the public literature. In addition, GRE significantly optimizes memory usage, so that it can process a large graph of 1 billion vertices and 17 billion edges on our cluster with 768 GB of memory in total, while PowerGraph can only process graphs of less than half this scale.
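
The abstract names a Scatter-Combine model but does not show its interface, so the following is only a rough single-process sketch of the general idea, applied to PageRank: each edge independently "scatters" a contribution to its target vertex, and contributions are "combined" with a commutative sum. The function name, parameters, and dangling-vertex handling are assumptions made for illustration; none of it is GRE's actual API, and it involves no active messages, agents, or partitioning.

```python
from collections import defaultdict


def pagerank_scatter_combine(edges, num_vertices, damping=0.85, iterations=20):
    """Toy PageRank in a scatter/combine style (hypothetical, not GRE code)."""
    # Count outgoing edges per vertex; vertices with none stay at 0.
    out_degree = defaultdict(int)
    for src, _ in edges:
        out_degree[src] += 1

    rank = [1.0 / num_vertices] * num_vertices
    for _ in range(iterations):
        # Combine buffer: one accumulator per destination vertex.
        combined = [0.0] * num_vertices
        # Scatter: every edge sends its source's share to the target.
        # Because "+" is commutative and associative, edges could be
        # processed in any order (here they are simply processed in a loop).
        for src, dst in edges:
            combined[dst] += rank[src] / out_degree[src]
        # Assumption: rank held by vertices without out-edges is
        # redistributed uniformly so the ranks keep summing to 1.
        dangling = sum(rank[v] for v in range(num_vertices) if out_degree[v] == 0)
        rank = [
            (1.0 - damping) / num_vertices
            + damping * (combined[v] + dangling / num_vertices)
            for v in range(num_vertices)
        ]
    return rank
```

For example, calling `pagerank_scatter_combine([(0, 1), (1, 2), (2, 0)], 3)` on a three-vertex cycle leaves every vertex with an equal share of the rank, since each vertex receives exactly what it sends.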